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Understanding the factors that explain differences in survival 
times is an important issue for establishing policies to improve na¬ 
tional health systems. Motivated by breast cancer data arising from 
the Surveillance Epidemiology and End Results program, we pro¬ 
pose a covariate-adjusted proportional hazards frailty model for the 
analysis of clustered right-censored data. Rather than incorporating 
exchangeable frailties in the linear predictor of commonly-used sur¬ 
vival models, we allow the frailty distribution to flexibly change with 
both continuous and categorical cluster-level covariates and model 
them using a dependent Bayesian nonparametric model. The result¬ 
ing process is flexible and easy to £t using an existing R package. 

The application of the model to our motivating example showed that, 
contrary to intuition, those diagnosed during a period of time in the 
1990s in more rural and less affluent Iowan counties survived breast 
cancer better. Additional analyses showed the opposite trend for ear¬ 
lier time windows. We conjecture that this anomaly has to be due 
to increased hormone replacement therapy treatments prescribed to 
more urban and affluent subpopulations. 


1. Introduction. Based on data gathered for Iowa State in the Surveil¬ 
lance Epidemiology and End Results (SEER) program of the National Can¬ 
cer Institute, we assess the effect of potential risk factors for womens’ breast 
cancer. This involves the analysis of clustered time-to-event right-censored 
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data, where event times of patients from the same county of residence are ex¬ 
pected to be associated with each other, possibly due to sharing common un¬ 
observed characteristics, such as region-specific differences in environments, 
treatment resources or diagnosis of the patients. As is widely known, tak¬ 
ing into account the clustered nature of the data is a must to obtain valid 
statistical inferences [see, e.g., Therneau and Grambsch (2000), Chapter 8]. 

A standard way of modeling clustered survival data is to introduce a com¬ 
mon random effect (frailty) into the survival model for each cluster, yielding 
shared frailty models. “Frailties,” termed by Vaupel, Manton and Stallard 
(1979), were originally introduced to deal with possible heterogeneity due to 
unobserved covariates and are regarded as unobserved common characteris¬ 
tics for each cluster able to account for the dependence among event times. 
In the context of the proportional hazards (PH) model, as conventionally 
implemented, frailties are incorporated into the linear predictor, and the 
median or mean of the frailty distribution is constrained to be zero to avoid 
identifiability problems. Conditional on the frailty, the model retains its in¬ 
terpretation in terms of constants of proportionality of the hazards. Survival 
models with frailties have been extensively used in the statistical literature, 
especially when the comparison of event times within cluster is of interest. 

A common assumption in shared frailty survival models is the one of 
homogeneity, where the frailties are assumed to be independent and iden¬ 
tically distributed (i.i.d.) random variables from a parametric or nonpara- 
metric distribution [see, e.g., Clayton and Cuzick (1985), Gustafson (1997), 
Qiou, Ravishanker and Dey (1999), Walker and Mallick (1997)]. Although 
the nonparametric approach provides flexibility in capturing a frailty distri¬ 
bution’s variance, skewness, shape and even modality, it essentially assumes 
that these frailty distributional aspects are the same across all the clus¬ 
ters, which may be restrictive for particular data sets [Noh, Ha and Lee 
(2006)]. For example, in the kidney transplantation study, Liu, Kalbfleisch 
and Schaubel (2011) argue that the frailty distribution may be affected by 
some cluster-level covariates, since “... urban transplant facilities may ex¬ 
hibit more uniform practices than rural transplant hospitals, corresponding 
to less heterogeneity (smaller variance) for frailties of urban centers. .. ” 
Ignoring such heterogeneity can drastically affect the inference for cluster- 
specific effects and prediction [McCulloch and Neuhaus (2011)]. 

As the process generating the frailty terms is on its own right of scientific 
interest, different extensions of the i.i.d. frailty modeling approach have been 
considered. Wassell and Moeschberger (1993) studied the impact of inter¬ 
ventions in the Framingham Heart Study by introducing a modified gamma 
frailty with a pairwise covariate-dependent parameter. Yashin and lachine 
(1999) considered the dependence between frailty and observed covariates 
(BMI and smoking) in Danish twins to investigate the heritability of suscep¬ 
tibility to death. Noh, Ha and Lee (2006) verihed frailty distribution hetero¬ 
geneity in a well-known kidney infection data set by applying a dispersed 
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normal model. Cottone (2008) assumed either Bernoulli or normal distribu¬ 
tions for the frailties where the frailty distribution mean or variance depends 
on cluster-level covariates through specified link functions. Liu, Kalbfleisch 
and Schaubel (2011) proposed a covariate-dependent positive stable shared 
frailty model with an application to kidney transplantation data from the 
Scientific Registry of Transplant Recipients, and demonstrated the hetero¬ 
geneity in facility performance. Wang and Louis (2004) studied a related 
approach for binary data that has both conditional and marginal interpre¬ 
tation using the so-called bridge distribution instead of positive stable. 

The previously described model extensions allow for particular and spe¬ 
cific aspects of distributional shape to change with cluster-level covariates. 
However, a more thorough evaluation of the effect of the predictors should 
account for potential changes in characteristics of the frailty distribution 
other than just, for example, the location or scale. It is, for instance, useful 
to examine potential changes in the skewness, symmetry and multimodal¬ 
ity of the frailty distribution. Therefore, a nonparametric formulation that 
anticipates changes in shape, skew and modality beyond simple location 
models is of interest. 

In this paper, we propose a practicable and general framework for model¬ 
ing clustered survival data as a function of covariates, based on a predictor- 
dependent Bayesian nonparametric model for the frailties and the Cox’s 
PH model. Under the proposed approach the frailty distribution flexibly 
changes with both continuous and categorical cluster-level covariates, thus 
allowing for full heterogeneity across clusters. We apply this modeling ap¬ 
proach to a subset of the SEER county-level breast cancer data consisting 
of 1073 women diagnosed with malignant breast cancer during 1995-1998. 
Important patient-level covariates include age at diagnosis, race, county of 
residence and the stage of the disease. Additional county-level covariates po¬ 
tentially associated with breast cancer survival are also available from census 
data, including median household income, poverty level, education and a ru- 
rality measure. These area-level socioeconomic factors have been discovered 
to be associated with breast cancer by many researchers [e.g., Sprague et al. 
(2011)]. Women living in more affluent or less rural geographic areas tend 
to survive breast cancer better after a diagnosis than those living in regions 
with indicators of low socioeconomic status. Moreover, rural counties may 
present more heterogeneity in access to quality care and screening for breast 
cancer, leading to more variability for frailties of rural counties [Zhao and 
Hanson (2011)]. This suggests to us that the frailty distribution could be 
potentially affected by these county-level socioeconomic factors. The results 
show that the proposed model provides better goodness of fit to the data 
and is predictively superior to the traditional PH spatial frailty model, as 
well as helping to piece together a plausible story for the data in terms of 
the prescribing of hormone replacement therapy. 
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The paper is organized as follows. In Section 2 we introduce the pro¬ 
posed frailty PH model, including a detailed description of the dependent 
Bayesian nonparametric model and the Markov chain Monte Carlo (MCMC) 
implementation of the posterior computations. Section 3 provides a detailed 
analysis of the motivating data set. Section 4 presents the results of sim¬ 
ulation studies to evaluate the performance of the proposed model. Some 
concluding remarks and a final discussion are given in Section 5. 

2. Covariate-adjusted frailty proportional hazards model. 

2.1. The modeling approach. Suppose that right-censored survival data 
are collected for the jth subject of the fth cluster, where j = 
1,..., n*, z = 1,..., n, Wij is a p-dimensional vector of exogenous covariates, 
tij is the recorded event time, and 5ij is the censoring indicator equaling 1 if 
tij is an observed event time and equaling 0 if the event time is right-censored 
at tij. Let Tij and Cij be the event and censoring times, respectively, for 
the jth. subject in the zth cluster. To take into account the within-cluster 
association structure, a frailty PH model is assumed for Tij. The conditional 
PH assumption implies that the hazard function of Tij is given by 

(1) X{t\wij,ei) = Xo{t) exp(wL^ a), 

where e = (ei,..., CnY is an unobserved vector of frailties, and Ao(t) is the 
baseline hazard function corresponding to the event time of a subject with 
covariates w = 0 and e = 0. We additionally assume a conditionally inde¬ 
pendent censoring scheme, that is, Cij and Tij are independent given Wij 
and Bj. Often the frailties are assumed to be exchangeable or i.i.d. from 
some parametric or nonparametric distribution G. For instance, Therneau, 
Grambsch and Pankratz (2003) considered exchangeable Gaussian frailties 
and proposed an estimation procedure based on a Laplace approximation 
of the likelihood function leading to a penalized partial likelihood. This ap¬ 
proach, referred to below as GF, will be compared with our method in the 
simulation studies. 

Now consider a partition of the predictor vector Wij = (wL,x()', where 
Xj G T C R'J is a g-dimensional vector of cluster-level covariates and Wjj is a 
(p — ( 7 )-dimensional vector of subject-specific covariates, respectively, and the 
corresponding partition of the regression coefficient vector ^ On 

the scale of the linear predictor -|- e^, the frailty Ci models the cluster- 
specihc behavior and its distribution G is shifted by Therefore, the 

homogeneity assumption implies that the vector of cluster-level covariates 
Xj modifies only the location of the distribution of cluster-specific effects but 
not its shape. To relax this assumption, we consider a covariate-adjusted 
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frailty PH model, where the frailty distribution depends on cluster-level 
covariates Xj. That is, 

eilGx, 

where for every x G T”, Gx is a probability measure defined on M; this spec¬ 
ifies a probability model for the entire collection of probability measures 
= {Gx:x € -T}, such that its elements are allowed to smoothly vary 
with the cluster-level covariates x. Specifically, we consider a mixture of lin¬ 
ear dependent tailfree processes (LDTFP) prior [Jara and Hanson (2011)] 
for , denoted as 

g^\J,h,e,c,pr~^ LDTFP(/i, H*^'^, , 

and 


c\Q ~ Q, 

where J G N is the level of specification of the process, c G M+ is a prior 
precision parameter controlling the prior variability of the process, h{-) = 
i+^xp{.} ; is a J-level sequence of binary partitions of M, depending on 

the scale parameter 9 G M+, = {2n/cp{l ),..., 2n/cp{J)} is a collection 

of positive numbers depending on J, c and p, p: N —M+ is an increasing 
function, and Q is a probability measure defined on M+. 

The LDTFP is specified such that for every x G T, the process Gx is 
centered around an N{0,9) distribution, that is, E{G:A} = -^(0,0), for every 
X G <T. Furthermore, the process is specified such that for every x G T, Gx 
is almost surely a median-zero probability measure. The latter property is 
important to avoid identifiability problems. The LDTFP process includes as 
important special cases a nonparametric exchangeable frailty model where 
Gx = Gx' for x' 7 ^ X as well as exchangeable normal frailties Gx = 9) 

for all X G T". 

As shown by Jara and Hanson (2011), dependent tailfree processes have 
appealing theoretical properties, such as continuity as a function of the pre¬ 
dictors, large support on the space of conditional density functions, straight¬ 
forward posterior computation relying on algorithms for fitting generalized 
linear models, and the process closely matches conventional Polya tree pri¬ 
ors [see, e.g., Hanson (2006a)] at each value of the predictor, which justify 
its choice here. Polya trees have been extensively studied in the literature 
and have desirable properties in terms of support and posterior consistency. 
Details on the trajectories of LDTFP(h, H"^’®, A'^’'^’^), useful for a complete 
implementation of algorithms for exploring the corresponding posterior dis¬ 
tributions, are given in Appendix A of the supplementary material [Zhou 
et al. (2015)]. 
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Other dependent processes could be considered for ^ but a highly lim¬ 
iting requirement is that some aspect of the location, for example, mean or 
median, can be fixed. There are few examples where the process changes 
smoothly with covariates; one is the multivariate beta process of Trippa, 
Muller and Johnson (2011). Another approach using Dirichlet process mix¬ 
tures can be found in Reich, Bondell and Wang (2010), but this latter ap¬ 
proach would have to be extended to allow the means or variances of the 
two mixture components to change with covariates. 

2.2. Posterior computation. The conditional likelihood for (^,Ao,e) is 
given by 

n Ui 

m, Ao, e) = ]j[Ao(tij) exp(wb^ -k exp{-Ao(tij) exp(wb4 -k e*)}, 

i=lj=l 

where Ao(t) = /g Ao(s)ds is the cumulative hazard function. The piecewise 
exponential model provides a flexible framework to deal with the baseline 
hazard [see, e.g., Walker and Mallick (1997)]. We partition the time period 
M+ into K prespecified intervals, say, /^ = (afc_i, Ofc], A: = 1,..., AT, where 
oq = 0 and ax = max{Aj}. The baseline hazard is assumed to be constant 
within each interval, that is, 

K 

Ao(A) = ^2 £ h}, 

k=l 

where Ai,..., Xx are unknown hazard values and I {A} is the usual indicator 
function, that is, 1 when A is true, 0 otherwise. The prior hazard is speci¬ 
fied by the hazard values and cut-point vector a = (oi,... ,ax)- If 

the prior on the A^’s is taken to be independent gamma distributions and 
{Ik}k=i ^ reasonably fine mesh, the gamma process [Kalbfleisch (1978)] 
is approximated. To determine the cut-point vector a, one can set to 
be the -^th quantile of the empirical distribution of the Aj’s, or choose 
them based on other considerations (see Section 3.2). Some authors have 
considered random cut-points [see, e.g., Sahu and Dey (2004)]. Regard¬ 
less, the resulting model implies a Poisson likelihood [Laird and Olivier 
(1981)] as follows. Let K{t) = m.m{k:ak > t}, Afc(f) = min{afc,f} — a^-i, 
and yijk = Sijl{k = K{tij)}. Set Zijk = {i^'k^^'ijY and 7 = (A',^')'> where 
tfc is a iL-dimensional vector of zeros except the /cth element is 1 and 
A = (log(Ai),... ,log(Ai^))'. Then the likelihood for ( 7 , 0 ) becomes 

n rii 

^(7,e) = ]j[exp{log(A;^(i,.)) + wL^ + 

i=lj=l 
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X g-exp{log(Aj;)-|-wC^-|-ei}Aj;(Lj) 

. k=l 

n rii K {tij ) 

i=lj=l k=l 

n rii K {tij) 

ociiii n p{yijkh,ei), 

i=lj=l k=l 

where ^ijk = exp{z ^^7 + Si + log{Ak{tij))} and p{yijkh,ei) is the proba¬ 
bility mass function for a Poisson distribution with mean ^ijk- For each 
z = 1 ,..., n, let iVj = X]j=i iF(tjj), = {yijk) be an x 1 vector with sub¬ 
script ijk in lexicographical order. 

Thus, the proposed covariate-adjusted frailty PH model takes the follow¬ 
ing hierarchical structure: 

rii ^{^ij) 

y*l7,ei -'H n p{yijkh,ei), 

j=l k=l 

7 ~ iV_K+p(7o,So), 

g^\j,h,e,c,p ~ LDTFP(/i,n-^’^^‘^’"’0> 

9~'^ ~ F(Ti,r 2 ), c~F(ac,6c), 

which largely simplifies computations, where A^p(m, S) refers to a p-variate 
normal distribution with mean m and covariance matrix S. This forms the 
basis of an efficient Markov chain Monte Carlo (MCMC) scheme for obtain¬ 
ing posterior inference, which can be implemented using available software 
for generalized linear mixed models. A full description of the MCMC al¬ 
gorithm is given in Appendix B of the supplementary material [Zhou et al. 
(2015)]. Sample R code using the LDTFPglmm function available in DPpackage 
[Jara et al. (2011)] is provided in Appendix C of the supplementary material 
[Zhou et al. (2015)]. 

Time-dependent subject-specific covariates that are step-processes [Han¬ 
son, Johnson and Laud (2009)] are naturally accommodated by including 
the times where the covariate values change across all subjects into the cut- 
point vector a. All that is changed above is Zjjk = Wk^'^ijkY^ '^ij 

replaced with its time-varying analogue Wjjfc. Similarly, time-varying regres¬ 
sion effects can be included by replacing zC ^7 with zC^ 7 ;. in Pijk, yielding 
very general models. The proposed model implies exchangeable frailties for 
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each subgroup with a unique x £ A”. Time-dependent cluster-specific covari¬ 
ates are therefore naturally included in the model by simply allowing x to 
change with time. For example, in the SEER data set analyzed over a larger 
time window, for subjects living in the ith county, one could include into x* 
the median house income of that county at their particular diagnosis year. 
Furthermore, the frailty distribution can itself evolve in time by simply in¬ 
cluding time as a covariate in x, or a time-by-cluster covariate interaction 
could also be entertained. 

3. Analysis of SEER county-level breast cancer data. 

3.1. The Iowa SEER data. The SEER program of the National Cancer 
Institute (see http://seer.cancer.gov/) is an authoritative source of in¬ 
formation on cancer incidence and survival in the US, providing county-level 
cancer data on an annual basis for particular states for public use. We fit 
our proposed covariate-adjusted frailty Cox’s PH model to a subset of the 
Iowa SEER breast cancer survival data, which consists of a cohort of 1073 
women from the 99 counties of Iowa, who were diagnosed with malignant 
breast cancer in 1995, with enrollment and follow-up continued through the 
end of 1998. The observed survival time, from 1 to 48, was calculated as 
the number of months from diagnosis to either death or the last follow¬ 
up. In our analysis, only deaths due to metastasis of cancerous nodes in the 
breast were considered to be events, while the deaths from other causes were 
censored at the time of death. That is, we consider cause-specific survival 
models assuming that all other deaths are independent of breast cancer. By 
the end of 1998, a total of 488 patients (45.5%) had died of breast cancer, 
while the remaining 585 patients were censored, either because they died of 
other causes or survived until the last follow-up. 

For each patient, the observed survival time and county of residence at 
diagnosis are recorded. The data set also has individual-level covariates 
including age at diagnosis and the stage of the breast cancer: local (con¬ 
fined to the breast), regional (spread beyond the breast tissue), or distant 
(metastasis). We create two dummy variables for regional and distant, re¬ 
spectively, and treat the patients in the local group as the baseline. Al¬ 
though several individual-level covariates that affect breast cancer survival 
are not available (e.g., age at first child, age at menopause and breastfeed¬ 
ing), we are able to obtain county-level covariates potentially associated 
with breast cancer survival from census data, such as median household in¬ 
come (small area estimates in 1993), poverty level (percentage of families in 
poverty in 1990), education (percentage with Bachelor’s degree or higher in 
1990) and rurality (Rural-Urban Continuum Codes in 1993). The Economic 
Research Service Rural-Urban Continuum Codes (RUCC) vary from 1 to 
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Table 1 

Iowa SEER data: Summary statistics for follow-up times and both individual- and 

county-level covariates 


Continuous variables 

Minimum 

Median 

Maximum 

Follow-up time in months 

1 

19 

47 

Age in years 

26 

72 

103 

RUCC 

2 

7 

9 

Income (xlOOO) 

20.627 

29.110 

39.356 

Categorical variables 

Level 

Count 

Proportion (%) 

Status 

Event 

488 

45.5 


Censored 

585 

54.5 

Stage 

Local 

510 

47.5 


Regional 

355 

33.1 


Distant 

208 

19.4 


9 (see www.ers.usda.gov/data-products/rural-urban-continuum-codes), dis¬ 
tinguishing metropolitan counties by the population size of their metro area 
and nonmetropolitan counties by degree of urbanization and adjacency to a 
metro area. Higher RUCC indicates a more rural county. Other county-level 
covariates mentioned above are available at http: //data. iowadatacenter. 
org/browse/counties .html. Since the effects of education and poverty on 
the survival times are not significant based on our initial model fitting by the 
proposed method, we exclude them in the analysis presented below. Thus, we 
have three-dimensional w^- and two-dimensional x*. Table 1 presents several 
summary statistics for the data. As shown in Figure 1, median household 
income and RUCC are significantly, negatively correlated. 

To get an initial feeling about the role that each county-level covariate 
is playing. Table 2 provides the distribution of each county-level covari¬ 
ate stratified by the individual-level stage of disease. The gamma statistic 
(GK), originally proposed by Goodman and Kruskal (1954), is calcnlated to 
quantify the association between each county-level covariate and the stage 
of disease. The GK values range from —1 (100% negative association) to 
1 (100% positive association), where the value 0 indicates no association. 
We see that women with a distant-stage at diagnosis are much more likely 
than those with a local-stage to live in counties with a high degree of urban¬ 
ization (GK = —0.11; 95% Cl: from —0.20 to —0.01), while the association 
between stage and income is not significant (GK = 0.04; 95% Cl: from —0.06 
to 0.13). These associations roughly imply that women living in urban coun¬ 
ties may suffer poorer survival, assuming that women in distant-stage are 
more likely to die than women in other stages. Next, we carefnlly examine 






10 


ZHOU, HANSON, JARA AND ZHANG 




RUCC 

(a) (b) 


Fig. 1. Iowa SEER data: panel (a) shows the scatter plot and simple linear regression line 
hy regressing median household income on RUCC. Panel (b) shows the baseline hazards 
for Model 1. The dashed line corresponds to Breslow’s estimate of Xo{t) obtained by the 
CF approach, where the circles represent the hazard values at each month; the solid line 
is the fitted baseline hazard by our approach, where the solid squares correspond to the 
cut-point values a= (1,11,16,17,19,20,25,28,29,36,40,44,47). 


both these individual-level and county-level covariates in relation to breast 
cancer survival, fitting the covariate-adjusted frailty proportional hazards 
model. 


Table 2 

Iowa SEER data: Distribution of each county-level covariate stratified by individual-level 
stage. The pattern of numbers is Number of women (%). Goodman and Kruskal’s gamma 
statistics (95% confidence intervals) are —0.11 (-0.20,-0.01) and 0.04 (—0.06,0.13) for 

RUCC and Income, respectively 





Stage 


Covariates 

All women 

N = 1073 

Local 

N = 510 

Regional 

N = 355 

Distant 
N = 208 

RUCC 

1-3 

314 (29.3) 

131 (25.7) 

99 (27.9) 

84 (40.4) 

4-7 

666 (62.1) 

342 (67.1) 

221 (62.3) 

103 (49.5) 

8-9 

93 (8.6) 

37 (7.2) 

35 (9.8) 

21 (10.1) 

Income (xlOOO) 
20-27 

163 (15.2) 

79 (15.5) 

51 (14.4) 

33 (15.9) 

27-34 

651 (60.7) 

312 (61.2) 

223 (62.8) 

116 (55.8) 

>34 

259 (24.1) 

119 (23.3) 

81 (22.8) 

59 (28.3) 
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3.2. Models and model comparison. We fitted the proposed covariate- 
adjusted frailty PH model for the Iowa SEER data with different county- 
level covariates, including models with RUCC only (Model 1), with median 
household income only (Model 2) and with both (Model 3). To see how 
the piecewise assumption of baseline hazard affects the predictive ability of 
models, we considered three specifications of cut-point vector a as follows: 

Case I. a = (1,11,16,17,19,20,25,28,29,36,40,44,47), which was deter¬ 
mined by visually examining Breslow’s estimate of Xo{t) using the GF ap¬ 
proach, which is given in panel (b) of Figure 1. 

Case II. a= (3,7,12,16,19,24,29,34,41,47), where at is the ^th quan¬ 
tile of the empirical distribution of observed survival times. 

Case III. a= (47), which yields an exponential baseline hazard. 

In all cases, we set J = 4. We fitted all the models using the corresponding 
variants of the algorithm described in Appendix B of the supplementary 
material [Zhou et al. (2015)] and similar prior specifications suggested in 
the simulation study. The Markov chain mixed reasonably well despite the 
high dimension of our models. For each version of our model and case, we ran 
a single Markov chain of 1,020,000. A total number of 20,000 were discarded 
as burn-in period and 10,000 samples were retained for posterior inference. 
Moreover, we also considered another case II with 13 cut-points and cut- 
point specifications based on the event time quantiles from the Kaplan-Meier 
curve in Appendix E of the supplementary material [Zhou et al. (2015)]. The 
results show that carefully choosing the cut-points is more important than 
simply increasing the number of cut-points. 

For the sake of comparison, we further fitted the exchangeable MPT frailty 
Cox model and the Bayesian exchangeable Gaussian frailty Cox model. 
We compare the models using the log pseudo marginal likelihood (LPML) 
developed by Geisser and Eddy (1979) and the deviance information cri¬ 
terion (DIG) proposed by Spiegelhalter et al. (2002). In the context of 
the frailty Cox model, the LPML for model M is defined as LPML = 
ZlILi SJii where CPOjj, the ijih conditional predictive or¬ 

dinate, is given by [A(tij)'^*-’e“^^*'-’'^]P(jj)] with denoting the remaining 
data after excluding the ijth data point Vij. One can use the simple method 
suggested by Gelfand and Dey (1994) to estimate the CPO statistics from 
MCMC output. A larger value of LPML indicates the corresponding model 
has better predictive ability. Furthermore, Geisser and Eddy (1979) dis¬ 
cussed the exponentiated difference in LPML values from two models to 
obtain what they termed as a pseudo Bayes factor (PBF). The PBF is a 
surrogate for the more traditional Bayes factor and can be interpreted simi¬ 
larly, but is more analytically tractable, much less sensitive to prior assump¬ 
tions, and does not suffer from Lindley’s paradox. Set ft = (e,7,/3,0) as the 
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Table 3 

Iowa SEER data: Deviance information criteria (DIG) and log of the pseudo marginal 
likelihood (LPML) for models under consideration 


Model 

Frailty 

Case I 

Case II 

Case III 

Die 

LPML 

Die 

LPML 

Die 

LPML 

1 

LDTFP 

4436 

-2222 

4463 

-2234 

4495 

-2247 


MPT 

4441 

-2225 

4463 

-2235 

4496 

-2248 


Gaussian 

4444 

-2225 

4467 

-2236 

4497 

-2248 

2 

LDTFP 

4441 

-2224 

4465 

-2235 

4498 

-2249 


MPT 

4440 

-2225 

4462 

-2236 

4497 

-2248 


Gaussian 

4443 

-2225 

4465 

-2235 

4498 

-2249 

3 

LDTFP 

4438 

-2223 

4464 

-2235 

4496 

-2248 


MPT 

4441 

-2225 

4464 

-2235 

4498 

-2249 


Gaussian 

4445 

-2226 

4467 

-2236 

4498 

-2248 


entire collection of model parameters. The DIG for model M is defined as 
DIG = D + pd = Efi^x>{D{fl)} + pd, where D{fl) = —21ogT(7,e) which is 
referred to as the deviance function, and po = D — which is a 

measure of model complexity. Note that the DIG is also readily computed 
from MGMG output. 

3.3. Results. Table 3 shows the DIG and LPML for all models under 
consideration. All models under case I provide significantly better prediction 
as measured by both DIG and LPML, with differences in the range of 20-55 
for DIG and 10-25 for LPML, which indicates that the determination of the 
cut-point vector for the baseline hazard plays an important role on model 
prediction and fit. Gomparing the frailty specifications in Model 1 across all 
cases, the DIG and LPML show the same trend for goodness of fit, with the 
proposed model based on the LDTFP frailty model outperforming both the 
MPT and Gaussian models, although the differences are only in the range of 
1-4. Gomparing between Model 2 and Model 3, the proposed model is always 
preferred in terms of LPML, while the MPT model is slightly better than 
others in term of DIG under Model 2. Gomparing all the proposed models 
across Model 1-Model 3, the results indicate that Model 1 always performs 
best. Overall, allowing the frailty distribution to change with county-level 
covariates (especially RUGG) does improve model prediction according to 
LPML. In what follows, we present the results under case 1. 

Table 4 presents posterior medians and equal-tailed 95% credible inter¬ 
vals (GI) for main effects (components of under Model 1-Model 3, with 
covariate-adjusted frailties, and compares the individual-level covariate ef¬ 
fects, that is, (^i,^ 2 )^ 3)5 to those obtained by Zhao, Hanson and Garlin 
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Table 4 

Iowa SEER data: Posterior medians (95% credible intervals) of fixed effects ^ from 

various models 


Predictor 

Model 1 

Model 2 

Model 3 

CAR 

Cox 

6 (Age) 

0.019 

0.020 

0.020 

0.018 

0.019 


(0.013, 0.025) 

(0.014, 0.026) 

(0.014, 0.026) 

(0.012, 0.025) (0.013, 0.025) 

(2 (Regional) 

0.27 

0.27 

0.27 

0.22 

0.30 


(0.03, 0.49) 

(0.03, 0.47) 

(0.05, 0.50) 

(0.01, 0.49) 

(0.08, 0.52) 

^3 (Distant) 

1.64 

1.67 

1.65 

1.65 

1.64 


(1.43, 1.88) 

(1.43, 1.89) 

(1.43, 1.89) 

(1.40, 1.93) 

(1.42, 1.87) 

(RUCC) 

-0.105 


-0.082 



( 

-0.185, -0.041) 

( 

-0.179, 0.011) 



(,X 2 (Income) 


0.042 

0.011 





(0.003, 0.084) ( 

-0.040, 0.066) 




(2009), under the standard nonfrailty Cox model and the Cox frailty model 
that has a MPT prior for the baseline survival, centered at the log-logistic 
family, and with conditionally autoregressive (CAR) county-level spatial 
frailties. The best fitting Cox model reported by Zhao, Hanson and Carlin 
(2009) has an LPML of —2226. Therefore, the pseudo Bayes factor for the 
proposed model versus the CAR model is ss 55, implying that the 

proposed model predicts about 55 times better than the model with CAR 
frailties. In addition, the proposed model offers a unique interpretation. The 
posterior medians and 95% CIs for all individual-level effects change little 
across the different versions of the proposed model, indicating that the Cox 
regression estimates are reasonably stable for these data, except for the esti¬ 
mate of “Regional stage,” for which the CAR model 95% Cl is much wider 
than those under the considered versions of the proposed model. This may 
be partly due to the benefit of including county-level covariates. The best 
model according to LPML, Model 1, indicates that all the individual-level 
effects are significant at the 0.05 level. Higher age at diagnosis increases the 
hazard within each county. For instance, women are about pg 1,46 

times more likely to die from breast cancer than those twenty years younger 
who have the same disease stage and live in the same county. Compared 
with women having local stage of disease, women of the same age and living 
in the same county are ~ 1.31 times more likely to die if their cancer 
is detected at the regional stage, and ~ 5.16 times more likely to die 
if detected at the distant stage. We additionally present the fixed effects 
under the marginal PH model (i.e., using the R function coxph with op¬ 
tion cluster) across Model l~Model 3 in Appendix E of the supplementary 
material [Zhou et al. (2015)]. Note that the coefficient estimates under the 
marginal PH model have population-averaged interpretations and cannot be 
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directly compared with those fitted from the proposed frailty PH model due 
to different model structures. 

Regarding the effect of county-level covariates, living in a higher median 
household income or urban counties is associated with poorer survival after 
a breast cancer diagnosis. For example, the results under Model 1 indicate 
that after controlling for individual covariates and county, the hazard rate of 
women living in urban counties (with RUCC = 2) will be ~ 2 times 

larger than that of women in rural counties (with RUCC = 9). The results 
under Model 2 imply that after controlling for individual covariates and frail¬ 
ties, women have about a 1.7 times larger hazard rate if they live in median 
household income counties of $35,301 compared with median household in¬ 
come of $23,354 (see also Figure 2). Under Model 3, the results indicate that 
when both the county-level covariates are included simultaneously, their in¬ 
dependent effects are attenuated, partly due to the multicollinearity between 
them (see the middle two plots in Figure 3). 

We obtain the fitted predictive frailty densities for both e* (median-zero) 
and e* -|- (full distribution) and survival curves for women with mean 
entry age 68.8 years and distant stage of disease who live in the counties with 
different levels of median household income or RUCC, under the different 
versions of the proposed model. The three levels are chosen from the 5%, 50% 
and 95% quantiles of each covariate value. The results are reported in Fig¬ 
ures 2 and 3. Under our best fitting. Model 1 (see left three plots in Figure 2), 
the results indicate that higher values of RUCC increase the frailty variance 
and suggest a non-Gaussian shape (upper); we also see overall higher frailty 
after mixing over the location shift (middle) and so poorer survival 

(lower) in urban counties. Increasing heterogeneity as ruralness increases 
under Model 1 translates into increasing association among those living in 
more rural counties versus urban. In Appendix E of the supplementary mate¬ 
rial [Zhou et al. (2015)], Kendall’s tau is computed and plotted as a function 
of RUCC for individuals with mean entry age 68.8 years and distant stage. 
Kendall’s tau increases by a factor of three as RUCC goes from 2 to 9. Note 
that under a traditional gamma frailty model the association is static. 

Under Model 2, the frailty densities only slightly change compared with 
Model 1, but we do see poorer survival in counties with higher median house¬ 
hold income. Figure 3 demonstrates that after adjusting individual covari¬ 
ates and median household income (right three plots), there is little effect of 
RUCC on either predictive frailty densities or survival curves; while after ad¬ 
justing for RUCC (left three plots), the effect of median household income 
is almost negligible. In Appendix E of the supplementary material [Zhou 
et al. (2015)], the survival curves are also compared with those obtained un¬ 
der the marginal PH model. Overall, the marginal PH model under-predicts 
survival time up to about 1 month, for example, it gives estimates of median 
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(c) (d) 




(e) 


(f) 


Fig. 2. Iowa SEER data: Eitted predictive frailty densities [panels (a) and (b)/, frailty 
densities with location shifts [panels (c) and {d)[ and survival curves [panels (e) and {!)[ 
for women with mean entry age 68.8 years and distant stage of disease from different 
county covariate levels under Model 1 [panels (a), (c) and {e)[ and Model 2 [panels (b), 
(d) and (f)/. In panels (a), (c) and (e), the results for KECC = 2, 5 and 9 are displayed 
as dashed, continuous and dotted lines, respectively. In panels (b), (d) and (f), the results 
for Income = 23.354, 29.176 and 35.301 are displayed as dashed, continuous and dotted 
lines, respectively. 
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(a) 



(c) 




(b) 



(d) 



(f) 


Fig. 3. Iowa SEER data: Eitted predictive frailty densities [panels (a) and (b)/, frailty 
densities with location shifts [panels (c) and {d)[ and survival curves [panels (e) and 
(f )[ for women with mean entry age 68.8 years and distant stage of disease from different 
county covariate levels under Model 3. In panels (a), (c) and (e), the results for RUCC = 2, 
5 and 9 are displayed as dashed, continuous and dotted lines, respectively. In panels (b), 
(d) and (f), the results for Income = 23.354, 29.176 and 35.301 are displayed as dashed, 
continuous and dotted lines, respectively. 
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survival a month less, compared with our proposed model for patients with 
mean entry age 68.8 years and distant stage of disease who live in the same 
county. This may be partly due to the fact that the marginal PH model aver¬ 
ages over the changing behavior of the frailty distribution over the ruralness 
measure. 

It is widely known that access to quality care and screening for breast can¬ 
cer is more readily available to those with greater financial means and/or 
those living in urban areas. Therefore, our findings of increased survival 
for poorer and more rural counties for this cohort are initially puzzling. 
However, hormone replacement therapy (HRT) increased about 150% in the 
1990s [Wysowski and Governale (2005)], after several observational studies 
linked HRT to prevention of osteoporosis and protection from heart disease. 
However, this increasing use of HRT abated suddenly in 2002, when the 
Women’s Health Initiative clinical trial linked HRT to aggressively invasive 
breast cancer [Rossouw et al. (2002)]. In fact, overall breast cancer incidence 
rates peaked in 1999. Between 2001 and 2004 overall invasive breast cancer 
incidence declined, but fell much more drastically among women living in 
urban versus rural counties, and among women living in low-poverty ver¬ 
sus high-poverty counties. Hausauer et al. (2009) attribute this discrepancy 
to greater use of postmenopausal estrogen/progestin hormone replacement 
therapy among more affluent women and/or women living in urban counties 
up until 2002, when the Women’s Health Initiative trial was stopped pre¬ 
maturely on May 31, 2002, according to Rossouw et al. (2002), “... because 
the test statistic for invasive breast cancer exceeded the stopping boundary 
for this adverse effect and the global index statistics supported risks exceed¬ 
ing benefits. It is plausible that increased risk (i.e., stochastically larger 
frailties) in more affluent and more urban counties has to do with a larger 
proportion of women being prescribed HRT in the late 1980s and 1990s. 
Further exploratory analyses on other cohorts of SEER Iowan breast cancer 
data (1975-1979, 1980-1984, 1985-1989 and 1990-1994) show a reversal of 
the effects of income and ruralness, agreeing with intuition. Paralleling our 
study, Krieger, Chen and Waterman (2010) used county-level census data 
on income and found rising and falling breast cancer incidence rates for the 
SEER data over the range 1992-2005 for Caucasian women living in high- 
income counties, which ^^mirrored the social patterning of hormone therapy 
use.'' 

In a longer follow-up study of the Women’s Health Initiative trial, Chle- 
bowski et al. (2010) found that those on estrogen plus progestin compared 
to placebo had about 25% higher incidence of invasive breast cancer. Among 
those diagnosed with breast cancer, the two treatment arms had similar his¬ 
tology, but the estrogen plus progestin group were 78% more likely to have 
cancers that had spread to lymph nodes than placebo, and the estrogen plus 
progestin group were about twice as likely to die from breast cancer versus 
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placebo. It would appear that hormone replacement therapy fortihed the vir¬ 
ulence of breast cancer, significantly increasing both incidence and mortality. 
This same study showed an impressive 7% one-year drop in incidence right 
after the Women’s Health Initiative study was prematurely stopped and the 
medical community warned of a possible link between hormone replacement 
therapy and breast cancer. 

4. Simulation studies. We performed a simulation study to assess the 
performance of the proposed model. The simulated data are also used to 
compare the proposed approach with existing models. Specifically, we con¬ 
sider the GF approach described in Section 2.1 and the positive stable frailty 
Cox model proposed by Liu, Kalbfleisch and Schaubel (2011). Under this 
latter model, the shape parameter is allowed to depend on cluster-level co¬ 
variates. In terms of our notation, they assumed that the conditional hazard 
function of Tjj is 

( 2 ) X{t\wij,Xi,ei) = Xoi(t)exp{wijli +a), 

where the baseline hazard functions Xoi{t) and regression parameters 
are cluster-specific, and exp(ei) follows a positive stable distribution with 
shape parameter a* G (0,1), relying on the cluster-level covariates vector x* 
through a logit link function, denoted by PS{ai). They did not deal with 
this model directly, but rather derived the marginal model 

(3) X{t\wij,Xi) = /io(t)exp(whT7) 

by imposing the restrictions r] = ai^^, HQ{t) = {Aoi(t)}"S where Hoit) = 
fo ho(s) ds and Aoi = Jq Aoi(s) ds. In other words, they essentially fitted the 
above marginal Cox model by maximizing the pseudo partial likelihood un¬ 
der the working independence assumption [Wei, Lin and Weissfeld (1989)], 
and then utilized the imposed constraints to estimate the parameters in 
the frailty model. Although they considered a more flexible conditional Cox 
model, they made many assumptions to get the marginal model, some of 
which are difficult to check in practice. Moreover, they faced a nonidentifia- 
bility problem when a cluster-level covariate was included in the conditional 
Cox model, so cluster-level covariates had to be excluded from the marginal 
model as well, leading to potentially poorer prediction of the marginal sur¬ 
vival function. Their method, referred to below as PSF, will be compared 
with our approach focusing on the prediction of survival functions in the 
simulation studies. A comparison of the two methods for the fixed effect es¬ 
timates cannot be conducted, since they have different model structures. 
We conducted the simulation study in R. The GF and PSF approaches 
were implemented by using the function coxme and coxph (with the option 
cluster), respectively, included in the R packages coxme and survival. 
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4.1. Simulation settings. Two scenarios for the frailty distributions were 
considered. In the first case, referred to as Scenario I, a covariate-dependent 
family of distributions is considered, where the density shape evolves from 
one mode to two as the cluster-specific covariate x increases its value; this 
mirrors the effect of RUCC in panel (a) of Figure 3 for Model 1. In the 
second case, referred to as Scenario II, a covariate-dependent positive stable 
distribution is considered. The specific distributional forms for each setting 
were the following: 

Scenario I. ei|xi0.5iV(-e°-^^% 1) + 0.5A^(e°-^^S 1), x* ' C/(-3,3). 

Scenario II. exp(ej)|xi PS{ai), ai = 1 /{1 + , Xi C/(0,2). 

Note that the first setting is not a particular case of the proposed model; 
the second setting, chosen from the simulation study of Liu, Kalbfleisch 
and Schaubel (2011), is included to evaluate the behavior of the proposed 
approach when the PSF model is correct. 

Given the frailties, the data under Scenario I were simulated from the 
conditional PH model (1) with Xo{t) = 1, Wjj = {wiij,W 2 ij-,Xi)' and ^ = 
= (1.0,0.5,1.0)'; the data under Scenario II were simulated from 
the PSF model (2) with Wjj = {wiij,W 2 ij)', ^7 = (1,0.5)' and H^it) = t. 
For each simulation scenario, 200 replicates of the data set were gener¬ 
ated by assuming the following: wuj ^"(0,1) and W2ij Bernoulli(0.5), 
i = 1,..., 100, j = 1,..., 10. In each case, a noninformative censoring scheme 
was considered, where the censoring times were simulated from an U (0.25,4) 
distribution, so that the censoring rate is approximately 35% under Scenario 
I and 25% under Scenario II. 

For each data set, the GF approach was employed, yielding point esti¬ 
mates of var(ej) and Cj, which we denote by and respec¬ 

tively. Based on these point estimates, the predictive survival function was 
calculated as follows: 

n 

(4) 5GF(t|w) =n“^^exp{-A[,°^(t)exp{w'^^°^ -Le®}}, 

i=l 

where Ag°^(t), depending on denotes Breslow’s estimator of Ao(t) [see, 

e.g., Therneau, Grambsch and Pankratz (2003), Section 2]. We then fit¬ 
ted the proposed model, by considering J = 4, AT = 10, ri = 1.001, T 2 = 
1.0010^^*^^ Oc = 1, hc = 1, 7 o = 0 i 3 and So = 10^ x I 13 . For each data set a 
single Markov chain of length 55,000 was obtained by using the algorithm de¬ 
scribed in Appendix B of the supplementary material [Zhou et al. (2015)]. A 
burn-in period of 5000 scans was considered, and 5000 samples were retained 
for posterior inferences. The posterior mean of the corresponding parame¬ 
ters are denoted by 0^, g{e\x) and A(t|w). Finally, the PSF approach was 
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considered but including the cluster-level covariates in the linear predictor, 
and the associated predictive survival function, based on Breslow’s estima¬ 
tor of the underlying baseline hazard function, was obtained and is denoted 
by 5psF(i|w). 

The competing approaches were compared regarding the estimation of 
the regression coefficients and also compared by computing the weighted 
integrated squared error (ISE) for the estimated survival distributions, given 
by 

{5m(t|w) - S{t\w)}^ fT{t\w) dt, 

where 5m(t|w), 5(^1 w) and /T(t|w) are the estimated survival function, the 
true survival function and density function, respectively, for a subject with 
covariate vector w. 

4.2. Simulation results. The results for the regression coefficients using 
the proposed model and the GF approach under Scenario I are given in 
Table 5, where the bias of the corresponding point estimators, the Monte 
Carlo mean of the posterior standard deviation/standard error (MEAN-SD), 
the Monte Carlo standard deviation of the point estimates (SD-MEAN) 
and the Monte Carlo coverage probability (CP) of the 95% credible inter¬ 
val/confidence intervals are presented. The results suggest that the posterior 
means of ^ are almost unbiased estimators and that the observed bias for 
under the proposed approach is much smaller than the corresponding value 
obtained under the GF approach. Moreover, under the proposed model, the 
MEAN-SD and the SD-MEAN values are in fairly close agreement, indicat¬ 
ing that the posterior standard deviation is an unbiased estimator of the 
frequentist standard error. Finally, the CPs are all around the nominal 95%. 



Table 5 

Simulation data—Scenario I; True value, bias of the point estimator, mean (across 
Monte Carlo simulations) of the posterior standard deviations/standard errors 
(MEAN-SD), standard deviation (aeross Monte Carlo simulations) of the point estimator 
(SD-MEAN) and Monte Carlo coverage probability for the 95% credible 
interval/confidence interval (CP) for the regression parameters. The results are presented 
under the proposed model and under the GF approach 


Para- 



Proposed model 


GF model 


meters 

True 

BIAS 

MEAN-SD SD-MEAN 

CP BIAS 

MEAN-SD SD-MEAN 

CP 

6 

1.0 

0.011 

0.052 

0.054 

0.930 -0.011 

0.051 

0.059 

0.910 

^2 

0.5 

0.008 

0.088 

0.090 

0.945 -0.003 

0.088 

0.091 

0.950 


1.0 

-0.009 

0.141 

0.126 

0.965 -0.052 

0.083 

0.142 

0.775 
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Table 6 

Simulated data—Scenario II; Monte Carlo mean (Monte Carlo standard deviation) for 
the ISE of the survival function for two different predictor values. The results for the 
different approaches under both simulation scenarios are presented. The numbers 
correspond to 10^ times the original values 


Scenario 

(■Wl,W2,x) 

Proposed model 

GF model 

PSF model 

I 

(2,1,-2) 

2.02 (2.48) 

4.37 (3.46) 

6.28 (3.49) 


(0,1,2) 

1.94 (2.53) 

10.5 (6.86) 

14.3 (10.9) 

II 

(2,1,0.5) 

3.17 (4.66) 

3.13 (3.33) 

2.19 (2.26) 


(0,1,1.5) 

0.96 (1.18) 

0.89 (1.22) 

0.83 (1.10) 


The same does not hold for GF, which substantially underestimates the 
standard error for leading to low coverage probabilities. 

The average of the estimated frailty distributions and survival functions 
across simulated data sets for some specific covariate values are presented 
in Figure 4 for Scenario I and in Figure 5 for Scenario II. The results in 
Scenario I reveal that the proposed model roughly captures the modal be¬ 
havior of the covariate-dependent frailty distributions. Although not per¬ 
fect, the proposed model performs remarkably well given that only n = 100 
imperfectly-observed observations were generated for each data set. The sit¬ 
uation is much less favorable for the GF approach, which fails to correctly 
capture the shape of the frailty distributions, leading to poor estimated 
survival functions. This behavior is likely driving the underestimation of 
survival noted in the SEER analysis. As expected, the PSF approach also 
suffers from bad prediction since the underlying assumption for frailty distri¬ 
bution is violated. The results in Scenario II show that the proposed model 
is still able to capture the frailty distributional shape even when the data 
were truly generated from the PSF model. Regarding the estimated survival 
curves, the results suggest that essentially no differences among the three 
methods are observed; all estimated functions are close to the truth, indi¬ 
cating that there is little price to be paid when using the proposed model for 
the clustered survival data that were truly generated from the PSF model. 

The results of the comparison of the estimated survival curves in terms 
of ISE are presented in Table 6, where the Monte Garlo mean and standard 
deviations for the ISE for two different predictor values are given. The results 
under Scenario I show a close agreement with the observed for the regression 
coefficients; the proposed model substantially outperforms the other two 
methods in terms of smaller means and standard deviations of the ISE. 
Even under Scenario II, the proposed model still provides almost the same 
results as the PSF model in terms of ISE. 

In Appendix D of the supplementary material [Zhou et al. (2015)], addi¬ 
tional simulation results are presented which show that, under Scenario I, for 
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Fig. 4. Simulated data—Scenario 1/ Mean, across simulations, of the posterior mean of 
the survival and frailty density functions under the proposed model. Panels (a) and (b) 
show the results for the survival functions. Panels (c) and (d) show the results for the 
frailty densities. Panels (a) and (c) show the results for covariate values (2,1,-2). Panels 
(b) and (d) show the results for covariate values (0,1, 2). The true curves are represented 
by continuous lines. The results under the proposed model are represented by dashed lines. 
The results under the exchangeable Gaussian frailty model are represented by dotted lines. 
In panels (a) and (b) the results obtained under the PSP approach are represented by 
dot-dashed lines. 


larger sample sizes better estimates of the frailty distributions are obtained 
and that the approach is not affected by the choice of J in the specification 
of the LDTFP model. For further comparison, we also fitted the exchange- 
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(a) (b) 



values 

(c) 


values 

(d) 


Fig. 5. Simulated data—Scenario II; Mean, across simulations, of the posterior mean 
of the survival and frailty density functions under the proposed model. Panels (a) and 
(b) show the results for the survival functions. Panels (c) and (d) show the results for 
the frailty densities. Panels (a) and (c) show the results for covariate values (2,1,0.5). 
Panels (b) and (d) show the results for covariate values (0,1,1.5). The true curves are 
represented by continuous lines. The results under the proposed model are represented by 
dashed lines. The results under the exchangeable Gaussian frailty model are represented 
by dotted lines. In panels (a) and (b) the results obtained under the PSF approach are 
represented by dot-dashed lines. 


able mixture of Polya trees (MPT) [Hanson (2006b)] frailty Cox’s model 
using the function PTglmm available in DPpackage [Jara et al. (2011)] under 
Scenario I, in which the results show that our approach outperforms the 
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MPT, and considered a third scenario favorable to the GP approach, where 
the results show that our method pays little price for the extra generality 
when using the proposed model when normality and exchangeability are 
valid assumptions. Overall, the proposed approach provides a flexible way 
to capture the heterogeneity in the frailty distribution, provides superior 
prediction, and yields an essential improvement for the estimation of pop¬ 
ulation effects, especially when the intra-cluster correlation (or variability 
in frailties) is relatively large. When the frailty variances are small across 
clusters, the proposed approach is still recommended due to its flexibility. 

5. Concluding remarks. Very limited work has been done on covariate- 
adjusted frailty survival models for clustered time-to-event data. Liu, 
Kalbfleisch and Schaubel (2011) proposed a stratified Cox model with posi¬ 
tive stable frailties, where the shape parameter of the frailty distribution is 
allowed to depend on cluster-level covariates. However, they essentially fit¬ 
ted a marginal Cox model, and then utilized the positive stable assumption 
and some imposed constraints to estimate the parameters in their proposed 
model. The model proposed in this paper cleanly separates population-level 
effects from the cluster-level effects, which determine the shape of the frailty 
distribution. Frailty density shape is modeled using a tractable median-zero 
LDTFP prior. Other nonparametric density regression approaches could also 
be considered; however, model identifiability requires a location constraint 
such as mean-zero or median-zero. The proposed model provides a natural 
generalization of the conventional PH model with parametric or nonpara¬ 
metric exchangeable frailties, and accommodates frailty distribution “evo¬ 
lution” over cluster-level covariates providing superior prediction, as shown 
in our simulation studies. When data are truly generated according to an 
exchangeable Gaussian frailty PH model or the model of Liu, Kalbfleisch 
and Schaubel (2011), our model does about the same as the underlying true 
model in terms of fixed effects and/or marginal survival estimations. We 
illustrate the usefulness of the proposed model with an analysis of a subset 
of the Iowa SEER breast cancer data, and demonstrate that higher degree 
of ruralness corresponds to a more bimodal frailty distributional shape with 
larger variance. In general, the proposed model is more flexible than cur¬ 
rently existing frailty PH models, leading to more robust inferences, and 
thus is recommended. One drawback of the proposed model is that, as cur¬ 
rently fit in R, obtaining inference takes longer. 

For ease of computation, we have assumed a piecewise constant structure 
for the baseline hazard function \o{t) and taken the independent normal 
prior distributions for log(Afc)’s, so that the baseline hazard heights and 
covariate effects ^ can be updated simultaneously. The use of empirically- 
derived cut-points has permeated much of the Bayesian survival literature 
for over a decade. Use of Breslow’s baseline estimate coupled with the GF 
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approach led to a greatly increased LPML over the empirical approach. An 
obvious extension of our current work is to employ a smoothed baseline, 
for example, using penalized B-splines [Hennerfeind, Brezger and Fahrmeir 
(2006)], the piecewise exponential with random cut-points [Sahu and Dey 
(2004)], MPT [Hanson (2006b)], etc. Any of these approaches could improve 
model fit and prediction, but cannot currently be fitted in the R software. 
We are currently working on extending the methodology in this paper to 
other survival models and smoothed baselines. 
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SUPPLEMENTARY MATERIAL 

Supplement to “Modeling county-level breast cancer survival data using 
a covariate-adjusted frailty proportional hazards model” 

(DOI: 10.1214/14-AOAS793SUPP; .pdf). In this online supplemental article 
we provide (A) technical details on the mixture of linear dependent tailfree 
processes, (B) a detailed description of the MCMC algorithm, (C) sample R 
code to analyze the SEER data, (D) additional simulation studies and (E) 
additional analysis of the SEER data. 
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